Perceptually Salient Regions of the Modulation Power Spectrum for Musical Instrument Identification
Abstract
The ability of a listener to recognize sound sources, and in particular musical instruments from the sounds they produce, raises the question of which acoustical information is used to achieve this task. It is now well known that the shapes of the temporal and spectral envelopes are crucial to the recognition of a musical instrument. More recently, Modulation Power Spectra (MPS) have been shown to be a representation that potentially explains the perception of musical instrument sounds. Nevertheless, the question of which specific regions of this representation characterize a musical instrument is still open. An identification task was applied to two subsets of musical instruments: tuba, trombone, cello, saxophone, and clarinet on the one hand, and marimba, vibraphone, guitar, harp, and viola pizzicato on the other. The sounds were filtered in the spectrotemporal modulation domain with 2D Gaussian windows, following a "molecular approach," the so-called bubbles method. The regions of the MPS most relevant for identification were then determined for each instrument. Overall, the instruments were correctly identified, and the lower values of spectrotemporal modulations are the most important regions of the MPS for recognizing instruments. Interestingly, instruments that were confused with each other had non-overlapping salient regions, and each was confused with the other when it was filtered in the other instrument's most salient region. These results suggest that musical instrument timbres are characterized by specific spectrotemporal modulations, information which could contribute to music information retrieval tasks such as automatic source recognition.
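The filtering idea in the abstract can be illustrated with a minimal sketch. It assumes the MPS is taken as the squared magnitude of the 2D Fourier transform of a log-magnitude spectrogram, and that a "bubble" is a 2D Gaussian window applied in that modulation domain. The function names, frame sizes, and bin-based coordinates are hypothetical choices for illustration; the abstract does not specify the authors' actual analysis or resynthesis procedure.

```python
import numpy as np

def log_spectrogram(signal, frame_len=256, hop=64):
    """Log-magnitude STFT of a 1D signal, shape (n_freq, n_frames)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)],
        axis=1,
    )
    return np.log(np.abs(np.fft.rfft(frames, axis=0)) + 1e-8)

def modulation_power_spectrum(log_spec):
    """Squared magnitude of the 2D Fourier transform of the log spectrogram.

    After fftshift, rows index spectral modulation and columns index
    temporal modulation, with zero modulation at the centre.
    """
    return np.abs(np.fft.fftshift(np.fft.fft2(log_spec))) ** 2

def gaussian_bubble(shape, center, sigma):
    """2D Gaussian window on the shifted modulation grid (units: bins).

    The window is mirrored through the origin so that both halves of the
    (approximately conjugate-symmetric) modulation plane are kept.
    """
    rows = np.arange(shape[0]) - shape[0] // 2
    cols = np.arange(shape[1]) - shape[1] // 2
    r, c = np.meshgrid(rows, cols, indexing="ij")

    def gauss(r0, c0):
        return np.exp(
            -((r - r0) ** 2 / (2 * sigma[0] ** 2) + (c - c0) ** 2 / (2 * sigma[1] ** 2))
        )

    return np.maximum(gauss(center[0], center[1]), gauss(-center[0], -center[1]))

def filter_modulations(log_spec, center, sigma):
    """Keep only the modulations under one bubble; return a filtered log spectrogram."""
    mod = np.fft.fftshift(np.fft.fft2(log_spec))
    mask = gaussian_bubble(mod.shape, center, sigma)
    # The real part is taken because the mask is only approximately
    # symmetric at the Nyquist bins of even-sized axes.
    return np.fft.ifft2(np.fft.ifftshift(mod * mask)).real

# Usage on a synthetic signal (white noise stands in for an instrument tone).
rng = np.random.default_rng(0)
tone = rng.standard_normal(4096)
spec = log_spectrogram(tone)
mps = modulation_power_spectrum(spec)
filtered = filter_modulations(spec, center=(5, 3), sigma=(4.0, 4.0))
```

In a bubbles-style experiment, many such windows at random positions would be applied across trials, and the positions associated with correct identifications would be accumulated to estimate the salient regions. That accumulation step, and turning the filtered spectrogram back into audio, are left out of this sketch.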
Similar resources
Musical Timbre and Emotion: The Identification of Salient Timbral Features in Sustained Musical Instrument Tones Equalized in Attack Time and Spectral Centroid
Timbre and emotion are two of the most important aspects of musical sounds. Both are complex and multidimensional, and strongly interrelated. Previous research has identified many different timbral attributes, and shown that spectral centroid and attack time are the two most important dimensions of timbre. However, a consensus has not emerged about other dimensions. This study will attempt to i...
Automatic Timbral Morphing Of Musical Instruments Sounds By High-Level Descriptors
The aim of sound morphing is to obtain a result that falls perceptually between two (or more) sounds. In order to do this, we should be able to morph perceptually relevant features of sounds instead of blindly interpolating the parameters of a model. In this work we present automatic timbral morphing techniques applied to musical instrument sounds using high-level descriptors as features. High-...
Full-Band Quasi-Harmonic Analysis and Synthesis of Musical Instrument Sounds with Adaptive Sinusoids
Sinusoids are widely used to represent the oscillatory modes of musical instrument sounds in both analysis and synthesis. However, musical instrument sounds feature transients and instrumental noise that are poorly modeled with quasi-stationary sinusoids, requiring spectral decomposition and further dedicated modeling. In this work, we propose a full-band representation that fits sinusoids acro...
Music Instrument Identification Using MFCC: Erhu as an Example
In the analysis of musical acoustics, we usually use the power spectrum to describe the difference between timbres from two music instruments. However, according to our experiments, the power spectrum cannot be used as effective features for erhu instrument identification. In this paper, we use MFCC (mel-scale frequency cepstral coefficients) as features for music instrument identification usin...
2pMU9. Musical instrument identification: A pattern-recognition approach
A statistical pattern-recognition technique was applied to the classification of musical instrument tones within a taxonomic hierarchy. Perceptually salient acoustic features, related to the physical properties of source excitation and resonance structure, were measured from the output of an auditory model (the log-lag correlogram) for 1023 isolated tones over the full pitch ranges of 15 orchest...
Volume: 8
Publication date: 2017